CROSSBRED OR HYBRID
Progeny that result from the cross
of two parental lines or breeds.

QUANTITATIVE TRAIT LOCI
(QTL). Genetic loci or
chromosomal regions that
contribute to variability in
complex quantitative traits (such
as plant height or body weight),
as identified by statistical analysis.
Quantitative traits are typically
affected by several genes, and the
environment.
THE USE OF MOLECULAR GENETICS
IN THE IMPROVEMENT OF
AGRICULTURAL POPULATIONS

Jack C. M. Dekkers and Frederic Hospital

Substantial advances have been made in the genetic improvement of agriculturally important
animal and plant populations through artificial selection on quantitative traits. Most of this 
selection has been on the basis of observable phenotype, without knowledge of the genetic 
architecture of the selected characteristics. However, continuing molecular genetic analysis of 
traits in animal and plant populations is leading to a better understanding of quantitative trait 
genetics. The genes and genetic markers that are being discovered can be used to enhance the 
genetic improvement of breeding stock through marker-assisted selection. 



MULTIFACTORIAL GENETICS
Genetic improvement through artificial selection has 
been an important contributor to the enormous 
advances in productivity that have been achieved over 
the past 50 years in plant and animal species that are of 
agricultural importance (FIG. 1). Most of the traits that
are selected on are complex quantitative traits, which
means that they are controlled by several genes, along
with environmental factors, and that the underlying
genes have quantitative effects on phenotype. So far,
most selection has been on the basis of observable phenotype,
which represents the collective effect of all genes
and the environment.

Sophisticated testing and selection strategies have 
been developed and implemented for many species, with 
the aim of improving the genetic performance in a breed 
or line through recurrent selection or introgression 
(BOX 1). Another goal is to develop superior crossbreds or
HYBRIDS through the combination of several improved
lines or breeds. Until recently, these selection programmes
were conducted without any knowledge of the
genetic architecture of the selected trait. Andersson and
Mauricio recently reviewed how molecular genetics is
used to discern the genetic nature of quantitative traits in
animal and plant species, respectively, by identifying
genes or chromosomal regions that affect the trait, so-called
QUANTITATIVE TRAIT LOCI (QTL). The purpose of this



article is to show how this information can be used to 
enhance genetic improvement of agriculturally impor- 
tant species. Our emphasis is on the use of natural varia- 
tion in a species, rather than on the introduction of new 
genetic variation through genetic modification, although
some of the programmes reviewed, such as introgres- 
sion, are also important in the introduction of trans- 
genes into breeding populations. 

The quantitative genetic approach 

The quantitative genetic approach to selection is based 
on knowledge of population genetic parameters for the 
traits of interest, such as HERITABILITIES, genetic variances
and GENETIC CORRELATIONS. These parameters can be estimated
using statistical analysis of phenotypic data from
pedigrees. However, the genetic architecture of the trait
itself is treated as a black box, with no knowledge of the 
number of genes that affect the trait, let alone of the 
effects of each gene or their locations in the genome. 
More specifically, quantitative genetic theory is based on 
Fisher's infinitesimal genetic model, in which the trait is
assumed to be determined by an infinite number of 
genes, each with an infinitesimally small effect. On the 
basis of this model, the expected increase in mean per- 
formance of a population per generation through 
genetic selection is proportional to the accuracy with 
Figure 1 | Examples of genetic improvements in livestock and crops. a | Average milk
production per lactation of US Holstein cows has nearly doubled during the past 40 years, as
shown by the top line (phenotypic yield) (Animal Improvement Programs Laboratory;
ftp://aipl.arsusda.gov/pub/trend/tnd11.H). More than half of this has been due to improved
genetics, as shown by the bottom line, which plots the progression of the population average
genetic value for milk yield. b | For corn, yields have increased fourfold during the past 60 years
(http://www.usda.gov/nass/aggraphs/cornyld.htm). Although yields have fluctuated from year to
year, primarily due to weather, there has been a consistently increasing trend, as shown by the
regression line (dashed). Again, more than half of this increased yield has been a result of genetic
improvement.



HERITABILITY
The fraction of the phenotypic
variance that is due to additive
genetic variance.

GENETIC VARIANCE
Variation in a trait in a
population that is caused by
genetic differences.

GENETIC CORRELATION
The correlation between traits
that is caused by genetic as
opposed to environmental
factors. A genetic correlation
between two traits results if the
same gene affects both traits
(pleiotropy) or if genes that
affect the two traits are in linkage
disequilibrium.



which the breeding value of selection candidates can be 
estimated, the intensity of selection and the genetic vari- 
ation in the population (see also the review by Barton 
and Keightley on p. 11 of this issue).
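The proportionality described in the preceding paragraph is commonly written as R = i × r × σ_A, where i is the selection intensity, r is the accuracy of the estimated breeding values and σ_A is the additive genetic standard deviation. The Python sketch below assumes this standard form. The function name, parameter names and numbers are illustrative and do not come from the article.

```python
def expected_response(intensity: float, accuracy: float, genetic_sd: float) -> float:
    """Expected gain in population mean per generation: R = i * r * sigma_A."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy is a correlation and must lie in [0, 1]")
    if genetic_sd < 0.0:
        raise ValueError("genetic standard deviation cannot be negative")
    return intensity * accuracy * genetic_sd


# Hypothetical numbers: same intensity and genetic variation, but breeding
# values estimated more accurately in the second case (for example, because
# marker information has been added to the phenotypic records).
baseline = expected_response(intensity=1.4, accuracy=0.5, genetic_sd=10.0)
improved = expected_response(intensity=1.4, accuracy=0.6, genetic_sd=10.0)
print(f"baseline gain {baseline:.2f}, improved gain {improved:.2f}")
```

Under this form, any gain in accuracy translates proportionally into gain per generation, which is the main route by which marker information can add to phenotypic selection.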

Despite the obvious flaws of the infinitesimal model, 
the tremendous rates of genetic improvement that have 
been achieved (FIG. 1) attest to the usefulness of the
quantitative genetic approach. Nevertheless, quantitative
genetic selection has several limitations: the
phenotype is an imperfect predictor of the breeding
value of an individual, it might not be observable in both
genders or before the time when selection decisions
must be made, and it is not very effective in resolving negative
associations between genes, such as those caused by
linkage or epistasis. The ideal situation for quantitative
genetic selection is that the trait has high heritability and


that the phenotype can be observed in all individuals 
before reproductive age. This ideal is hardly ever 
achieved (TABLE 1), which limits the effectiveness of
quantitative genetic selection. However, because DNA 
can be obtained at any age and from both genders, mol- 
ecular genetics can alleviate some of these limitations, as 
will be discussed below. 

Whereas selection in breeding populations primarily 
focuses on additive genetic effects, the non-additive 
effects of HETEROSIS OR HYBRID VIGOUR, which are observed
when lines or breeds are crossed, have also contributed
greatly to the performance of livestock and crops. In the
absence of any molecular data, breeding programmes
that are aimed at producing new and improved hybrids
or crossbreds are largely based on extensive testing by
trial and error. In plants, lines have been placed in a limited
number of heterotic groups, which, when crossed,
typically result in substantial hybrid vigour.

How can molecular genetics help? 

Molecular genetic analyses of quantitative traits lead to 
the identification of two broadly different types of 
genetic loci that can be used to enhance genetic 
improvement programmes: causal mutations and presumed
non-functional genetic markers that are linked
to QTL (indirect markers). Causal mutations for quantitative
traits are hard to find and difficult to prove, and few
examples are available. By contrast, non-functional or
anonymous polymorphisms are abundant across the 
genome and their linkage with QTL can be established 
by evidence of empirical associations of marker geno- 
types with trait phenotype. Two approaches are used to 
identify indirect markers: directed searches using candidate-gene
approaches in unstructured populations;
and genome-wide searches in specialized populations, 
such as crosses. Because candidate-gene markers 
focus on polymorphisms in a gene that are postulated 
to affect the trait, they are often tightly linked to the 
QTL. A candidate-gene marker can occasionally repre- 
sent the functional variant itself, although this is difficult 
to prove. Genome scans, conversely, can only identify
regions of chromosomes that affect the trait. The length 
of these regions is typically 10-20 cM, but the exact 
position and number of QTL in the region is unknown. 

Whereas causative polymorphisms give direct infor- 
mation about genotype for the QTL, the use of indirect 
markers for QTL mapping and for selection is based on 
the existence of linkage disequilibrium (LD) between the 
marker and the QTL. Marker-QTL LD can exist at the 
population level but always exists within families, even 
between loosely linked loci (BOX 2). Although two loci are 
expected to be in population-wide equilibrium in large 
random-mating populations, partial population-wide 
LD can exist by chance between tightly linked loci in 
breeding populations that are under selection. 
Population-wide LD can also be created by crossing lines
or breeds. Although LD will then exist even between
loosely linked loci, this LD will erode rapidly over generations.
Indirect markers that are identified using the candidate-gene
approach are expected to be in substantial
LD with the QTL with which they are associated. Unless 
Box 1 | Genetic improvement of agricultural species

Two important strategies for genetic improvement are recurrent selection and 
introgression programmes. The aim of a recurrent selection programme, which is the 
main vehicle for genetic improvement in livestock, is to improve a breed or line as a 
source of superior germplasm for commercial production through within-breed or 
within-line selection (part a of figure). This involves recording the phenotypes of
numerous individuals and the use of these phenotypes to estimate the 'breeding
value' of selection candidates. An example of performance testing is the progeny test,
in which the breeding values are estimated on the basis of the phenotype of progeny
that have been created through test matings. Test matings can be to individuals of the
same breed or line if the aim is to improve pure-bred performance, or to individuals
from another breed or line if the objective is to improve crossbred or hybrid 
performance. Improvement of stock for commercial production often involves 
further product development through testing, breeding or crossing to generate 
crossbreds or hybrids. 

Introgression is another important genetic improvement strategy, in particular in 
plants (part b of figure). The aim of an introgression programme is to introduce a 
'target' gene, which can be a single gene, a quantitative trait locus or a transgenic
construct, from an otherwise low-productivity line or breed (donor) into a 
productive line that lacks that particular gene (recipient; R). Introgression starts by 
crossing the donor and recipient lines, followed by repeated backcrosses (BC) to the 
recipient line to recover the recipient-line genome. The target gene is maintained in 
the backcross generations through selection of donor gene carriers. Recovery of the 
recipient genome can be enhanced by the selection of backcross individuals that have 
a high value for the recipient trait phenotype. Note that genetic improvement for this 
trait can be maintained by continuing recurrent selection in the recipient line 
(vertical arrows). Once a sufficient proportion of the recipient genome is recovered, 
the backcross line is intercrossed (to generate IC lines), and donor gene homozygotes 
are selected to fix the target gene. This might require more than one generation to
obtain sufficient individuals for further breeding or if several target genes must be 
introgressed. The effectiveness of introgression schemes is limited by the ability to 
identify backcross or intercross individuals with the target gene and by the ability to 
identify backcross individuals that have a high proportion of the recipient genome, 
in particular in regions around the target gene.
the functional polymorphism has been identified, however,
the LINKAGE PHASE of a candidate-gene marker with the
functional variant can differ from one population to the
next and must, therefore, be assessed in the population
in which it will be used. Although more abundant and
extensive, within-family LD is more difficult to use
because linkage phases between the markers and QTL
will not be the same in all families and must, therefore,
be assessed on a within-family basis.
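The erosion of LD between loosely linked loci can be illustrated with the usual disequilibrium coefficient D = f(MQ) − f(M)f(Q) and its expected decay under random mating, D_t = D_0(1 − r)^t, where r is the recombination rate. The Python sketch below assumes a large random-mating population without selection. The function names and example values are illustrative and do not come from the article.

```python
def ld_coefficient(freq_mq_haplotype: float, freq_m: float, freq_q: float) -> float:
    """D = f(MQ) - f(M) * f(Q); D = 0 corresponds to linkage equilibrium."""
    return freq_mq_haplotype - freq_m * freq_q


def ld_after_generations(d0: float, recombination_rate: float, generations: int) -> float:
    """Expected D after t generations of random mating: D_t = D_0 * (1 - r)**t."""
    if not 0.0 <= recombination_rate <= 0.5:
        raise ValueError("recombination rate must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations cannot be negative")
    return d0 * (1.0 - recombination_rate) ** generations


# Cross of two inbred lines carrying MQ and mq: only these two haplotypes
# occur, each at frequency 0.5, so D starts at its maximum of 0.25.
d0 = ld_coefficient(freq_mq_haplotype=0.5, freq_m=0.5, freq_q=0.5)
tight = ld_after_generations(d0, recombination_rate=0.01, generations=5)
loose = ld_after_generations(d0, recombination_rate=0.2, generations=5)
print(f"D0 = {d0:.3f}; after 5 generations: tight = {tight:.3f}, loose = {loose:.3f}")
```

With these illustrative values, most of the disequilibrium at r = 0.01 is retained after five generations, whereas at r = 0.2 about two-thirds is lost.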

The use of molecular genetics in selection pro- 
grammes rests on the ability to determine the genotype 
of individuals for causal mutations or indirect markers 
using DNA analysis. This information is then used to 
assess the genetic value of the individual, which can be 
captured in a MOLECULAR SCORE that can be used for selection.
This removes some of the limitations of quantitative
genetic selection discussed above (TABLE 1).

It is clear that the use of molecular data for genetic 
improvement would be most effective if the genetic 
architecture of a quantitative trait was completely 
transparent, such that we knew the number, the posi- 
tions and the effects of all the genes involved. In that 
case, the process of selection would be reduced to a 
simple 'building block' problem (genotype building)
of selection and mating to create individuals with the 
right combination of alleles at each QTL. However, 
this situation is far from reality and might never be 
achieved; although advances in molecular genetics 
have been able to partially explain the 'black box' of 
quantitative traits, the information provided by mole- 
cular data is incomplete, for three main reasons. First, 
in most cases, only a limited number of genes that 
affect the trait has been identified, albeit the ones with 
larger effects. A substantial part of the black box there- 
fore remains obscure, and selection exclusively on 
genotype for identified QTL would not result in a 
maximum response to selection. Instead, selection on 
molecular score must be combined with selection on 
phenotype, which reflects the collective action of all 
genes, including those that have not been identified. 
Second, with indirect markers, selection is not directly 
on the QTL, but on the marker, through LD. As LD 
erodes in the course of the selection programme 
owing to recombination, the efficiency of selection is 
reduced. Third, for both causal and indirect markers, 
the effects of the QTL must be estimated empirically 
on the basis of statistical associations between markers 
and phenotype. So, the use of molecular information 
does not remove the need for phenotypic information 
and, therefore, suffers to some degree from the same 
limitations as quantitative genetic selection. 

Application of molecular data

Despite the limitations outlined above, molecular
genetic information can be used to enhance several 
breeding strategies through what is broadly referred 
to as marker-assisted selection (MAS). All strategies 
for MAS are based on the use of a molecular score, 
although the composition of this score differs from 
application to application (TABLE 2). In addition to 
those described below, the applications of molecular 
Table 1 | Limitations of quantitative genetic selection and opportunities for the use of molecular data

Limit to quantitative selection: Phenotype is a poor predictor of breeding value (low heritability).
Example trait(s): Reproduction in animals, yield in plants.
Help provided by molecular data: Better estimate of breeding value at identified QTL.
Possible breeding solution: Select on molecular score and phenotype.
Economic merit of molecular data: Depends on requirements for QTL detection. Difficult to prove.

Limit to quantitative selection: Phenotype is difficult or expensive to record.
Example trait(s): Disease-related traits.
Help provided by molecular data: Markers are easier or cheaper to score than phenotype.
Possible breeding solution: Select on molecular score.
Economic merit of molecular data: Proportional to cost of phenotyping versus genotyping. Easy to prove.

Limit to quantitative selection: Phenotype expressed subsequent to reproductive age. Long generation interval.
Example trait(s): Reproduction traits in animals, grain yield in plants. Tree breeding.
Help provided by molecular data: Molecular score is available at earlier stage, resulting in faster selection.
Possible breeding solution: Select on molecular score in combination with phenotype of ancestors.
Economic merit of molecular data: Allows more rapid genetic gain and earlier release of improved genetic material.

Limit to quantitative selection: Individual has to be sacrificed to score its phenotype.
Example trait(s): Meat quality in animals, malting quality in barley.
Help provided by molecular data: Molecular score is available on all selection candidates.
Possible breeding solution: Select on molecular score in combination with phenotype of relatives.
Economic merit of molecular data: Substantial increase in genetic gain expected.

Limit to quantitative selection: Traits observed only in one gender.
Example trait(s): Milk yield in dairy cattle.
Help provided by molecular data: Molecular score is available at an early age on both genders.
Possible breeding solution: Select on molecular score and phenotype. Pre-select on molecular score for further phenotypic testing (for example, progeny test).
Economic merit of molecular data: Moderate, depending on opportunity for and costs of pre-selection.

Limit to quantitative selection: Genetic potential is masked by epistatic interactions between QTL, or by linked QTL that are in repulsion.
Example trait(s): Many traits.
Help provided by molecular data: Dissect and break down unfavourable interactions at the genetic level.
Possible breeding solution: Select on molecular score and phenotype.
Economic merit of molecular data: Difficult but can be spectacular if successful.

Limit to quantitative selection: Genotype-environment interactions.
Example trait(s): Many traits.
Help provided by molecular data: Predict interactions at the genetic level.
Possible breeding solution: Select on molecular score and phenotype.
Economic merit of molecular data: Unknown. Difficult to prove.

QTL, quantitative trait locus.



BREEDING VALUE
A measure of the value of an
individual for breeding purposes,
as assessed by the mean
performance of its progeny.

HETEROSIS OR HYBRID VIGOUR
When a hybrid or crossbred
individual has a higher
performance than the average of
its two parents (the animal
breeding definition), or than the
best parent (the plant breeding
definition). This is the result of
non-additive actions of genes
((over-)dominance and/or
epistasis).

LINKAGE DISEQUILIBRIUM
(LD). The condition in which the
frequency of a particular
haplotype for two loci is
significantly different from that
expected under random mating.
The expected frequency is the
product of observed allelic
frequencies at each locus.

LINKAGE PHASE
The arrangement of alleles at two
loci on homologous
chromosomes. For example, in a
diploid individual with genotype
Mm at a marker locus and
genotype Qq at a quantitative
trait locus, possible linkage
phases are MQ/mq and Mq/mQ,
for which '/' separates the two
homologous chromosomes.



data in genetic programmes include their use for 
parentage verification or identification (for example, 
when mixed semen is used in artificial insemination), 
and in genetic conservation programmes to identify 
unique genetic resources and quantify genetic 
diversity. 

Genotype building programmes. If many QTL are
known, and favourable alleles are present in different
lines or breeds, genotype building strategies can be used
to design new genotypes that combine favourable alleles
at all loci. Selection is then based on the molecular score
alone, which is determined by the genotype at those loci 
(possibly estimated through indirect markers), along 
with (if possible) information on linkage and linkage 
phase between those loci. Starting from a cross between 
two parental lines, the simplest genotype building strat- 
egy involves screening a population for individuals that 
are homozygous at the relevant loci. More than one
generation of mating and selection might be needed to
produce individuals that are homozygous for a larger
number of loci. In certain crop species, DOUBLE-HAPLOID
(DH) LINES are used, which provide homozygous recombi- 
nant genotypes in a single step, but these are not avail- 
able in animals. 
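The population sizes implied by this screening step can be sketched with elementary probability. In an F2 from two inbred parents, an individual is homozygous for the favourable allele at n unlinked loci with probability (1/4)^n; among double-haploid lines derived from the F1, the probability is (1/2)^n. The Python sketch below assumes unlinked loci, parental lines fixed for alternative alleles and no selection. The function names and the 95% confidence level are illustrative and do not come from the article.

```python
import math


def target_frequency(n_loci: int, doubled_haploid: bool = False) -> float:
    """Expected frequency of individuals homozygous for the favourable allele
    at all n unlinked loci, in an F2 (1/4 per locus) or in DH lines (1/2)."""
    if n_loci < 1:
        raise ValueError("at least one locus is required")
    per_locus = 0.5 if doubled_haploid else 0.25
    return per_locus ** n_loci


def min_population_size(frequency: float, confidence: float = 0.95) -> int:
    """Smallest N such that P(at least one target individual) >= confidence."""
    if not 0.0 < frequency < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("frequency and confidence must lie strictly in (0, 1)")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


# Three target loci: compare screening an F2 with screening DH lines.
n_f2 = min_population_size(target_frequency(3))
n_dh = min_population_size(target_frequency(3, doubled_haploid=True))
print(f"F2 individuals needed: {n_f2}; DH lines needed: {n_dh}")
```

The required population size grows geometrically with the number of loci, which is one reason why several generations of mating and selection, or double-haploid lines where available, are attractive for larger numbers of loci.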

When more than two parental lines are involved, gene 
pyramiding can be used to create individuals that are 
homozygous at all loci. Gene pyramiding involves multiple
initial crosses between several parents (FIG. 2). Because
the above strategies involve several generations of specific
matings and the production of numerous offspring, they
are more applicable to plants than animals. 



Introgression programmes. Introgression is a simple form 
of genotype building, in which a target gene is introduced 
into an otherwise productive, recipient line (BOX 1).
Molecular markers can be used in both the backcrossing 
and the intercrossing phases of such programmes. The 
effectiveness of the backcrossing phase can be increased in 
two ways (TABLE 2): by identifying carriers of the target 
gene (foreground selection); and by enhancing recovery 
of the recipient genetic background (background selection).
Strategies for foreground and background selection
have been the subject of several publications (for a recent
review, see REF. 10). During the intercrossing phase, markers
can be used to select individuals that are homozygous
for the target gene. For multiple QTL, introgression can
be combined with gene pyramiding to decrease the number
of individuals required.
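As a baseline against which marker-assisted background selection can be judged, the expected proportion of recipient genome without any background selection is 1 − (1/2)^(t+1) in backcross generation t (counting the F1 as t = 0). This expectation applies to genome regions that are unlinked to the target gene. The Python sketch below assumes this textbook expectation. The function names and the 99% threshold are illustrative and do not come from the article.

```python
def expected_recipient_proportion(backcross_generation: int) -> float:
    """Expected recipient-genome proportion in generation BC_t without
    background selection (t = 0 is the F1): 1 - 0.5**(t + 1)."""
    if backcross_generation < 0:
        raise ValueError("generation cannot be negative")
    return 1.0 - 0.5 ** (backcross_generation + 1)


def generations_to_reach(threshold: float) -> int:
    """Number of backcross generations needed for the expected recipient
    proportion to reach the threshold (threshold must be below 1)."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must lie in [0, 1)")
    t = 0
    while expected_recipient_proportion(t) < threshold:
        t += 1
    return t


for t in range(1, 7):
    print(f"BC{t}: {expected_recipient_proportion(t):.4f}")
print("Backcrosses to reach 99% on average:", generations_to_reach(0.99))
```

Background selection on markers aims to reach a given recovery in fewer generations than this baseline, at the cost of genotyping.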

In addition to requiring extra resources, an intro- 
gression programme diverts some selection pressure 
away from other traits of economic importance. To 
compensate for this, the benefit of the target gene must 
be greater than that which could be achieved by regular 
selection over the same period. Only genes with a large 
effect will meet this requirement.

Recurrent selection programmes. For a single marker, the 
molecular score of an individual for use in recurrent 
selection is obtained as the estimate of the statistical 
association between marker genotype and phenotype 
(TABLE 2). For multiple markers, genotype effects can be 
summed over all markers into a single molecular score.
In addition to the molecular score, phenotypic informa- 
tion will be available on the selection candidate itself 
Box 2 | Selection programmes based on linkage disequilibrium

Markers that are tightly linked to a quantitative trait locus (QTL) can be in complete or
partial population-wide linkage disequilibrium (LD) with the QTL, such that some
marker-QTL haplotypes are more frequent than expected by chance (for example, MQ
and mq versus Mq and mQ) (part a of figure). In this case, selection can be directly on
marker genotype. The probability of population-wide LD is higher for closely linked
markers and in selected populations of small effective size, which is the case for
agricultural species. Population-wide LD can also be created by crossing (ideally inbred)
lines or breeds and will then exist between loosely linked markers for several generations
(part b of figure). When a marker and a QTL are in linkage equilibrium, all marker-QTL
haplotypes are present and at random-mating frequencies, and marker genotype gives no
information about QTL genotype (part c of figure). This will be the case for most linked
markers in an outbreeding population. However, the marker and QTL will be in partial
disequilibrium within a family. The extent of within-family disequilibrium depends on
the recombination rate (r), but will occur even with loose linkage (for example, r = 0.2).
This disequilibrium can be used to detect QTL and for selection on a within-family basis.

[Figure: within-family disequilibrium]




MOLECULAR SCORE
A score that quantifies the value
of an individual for selection
purposes derived on the basis of
molecular genetic data.

PROGENY TESTING
Evaluation of the breeding value
of an individual based on the
mean performance of its progeny.

DOUBLE-HAPLOID LINE
(DH line). A population of fully
homozygous individuals that is
obtained by artificially doubling
the gametes produced by an F1
hybrid.

BACKCROSS
Crossing a crossbred population
back to one of its parents.



and/or its relatives. Given these alternative sources of
information, three strategies for the selection of candidates
for breeding can be distinguished: selection on molecular
score alone; selection on molecular score followed
by selection on phenotype; and combined selection on an
index of the molecular score and the phenotype.
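The multi-marker molecular score described above can be sketched as a weighted sum. The Python sketch below assumes purely additive marker effects and genotypes coded as the number of copies (0, 1 or 2) of a reference allele at each marker. The marker names, effect estimates and candidates are hypothetical.

```python
from typing import Dict


def molecular_score(genotype: Dict[str, int], effects: Dict[str, float]) -> float:
    """Sum the estimated marker effects, each weighted by the number of
    copies (0, 1 or 2) of the reference allele carried at that marker."""
    score = 0.0
    for marker, effect in effects.items():
        copies = genotype[marker]  # raises KeyError if a marker was not genotyped
        if copies not in (0, 1, 2):
            raise ValueError(f"allele count at {marker} must be 0, 1 or 2")
        score += effect * copies
    return score


# Hypothetical effect estimates (trait units per allele copy) and candidates.
effects = {"M1": 2.0, "M2": -0.5, "M3": 1.0}
candidates = {
    "A": {"M1": 2, "M2": 1, "M3": 0},
    "B": {"M1": 1, "M2": 0, "M3": 2},
}
scores = {name: molecular_score(g, effects) for name, g in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Selection on molecular score alone corresponds to choosing candidates from the top of this ranking. The other two strategies add phenotypic information either afterwards or through an index.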

Selection on molecular score alone will result in less 
genetic improvement than combined selection on mol- 
ecular score and phenotype, unless the molecular score 
captures all genetic variation or the phenotypic records 
provide no information to differentiate selection candi- 
dates. A prime example of the latter is when one or 
more members from a family must be selected before it 
is possible to collect phenotypic information that allows 
their breeding values to be differentiated (FIG. 3). This 
provides ideal opportunities for MAS because markers
are used at a stage of the continuing selection pro- 
gramme that is underused, as quantitative selection at 
that stage is ineffective. So, apart from the extra resource 
requirements, this is a rather risk-free approach, with
limited impact on response to quantitative genetic selec- 
tion. Opportunities to use MAS in this manner are cru- 
cially influenced by reproductive rates (see below). 



If informative phenotypic data are available along 
with molecular data, selection on a combination of 
molecular score and phenotypic information is the 
most powerful strategy. Methods to derive an index for 
combined selection were developed by Lande and 
Thompson using selection index theory. The index
optimally weights molecular score and phenotypic data
such that the accuracy of the index as a predictor of the
selection candidate's breeding value is maximized.
Combined selection is most effective when phenotypic
information is limited because of low heritability or
inability to record the phenotype on all selection candidates
before selection. The paradox is that the ability to
detect QTL, which also requires phenotypic data, is also
limited for such cases. So, unless different resources or
strategies are used for QTL detection, the greatest
opportunities for MAS might exist for traits with moderate
rather than low heritability.
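The dependence of combined selection on heritability can be illustrated with a two-source selection index, derived here under simplifying assumptions rather than taken from the article. The candidate's own phenotype z and its molecular score m are both expressed as deviations from the population mean. The score is assumed to equal the true breeding value for the marked QTL, estimated without error, and to explain a fraction p of the additive genetic variance. Solving the index equations then gives weights b_z = h²(1 − p)/(1 − p·h²) and b_m = (1 − h²)/(1 − p·h²). The function names in the Python sketch below are illustrative.

```python
import math
from typing import Tuple


def _check(h2: float, p: float) -> None:
    if not 0.0 < h2 <= 1.0 or not 0.0 <= p <= 1.0 or p * h2 >= 1.0:
        raise ValueError("need 0 < h2 <= 1, 0 <= p <= 1 and p * h2 < 1")


def index_weights(h2: float, p: float) -> Tuple[float, float]:
    """Weights (b_phenotype, b_score) of the index I = b_z * z + b_m * m."""
    _check(h2, p)
    denom = 1.0 - p * h2
    return h2 * (1.0 - p) / denom, (1.0 - h2) / denom


def index_accuracy(h2: float, p: float) -> float:
    """Correlation between the index and the true breeding value."""
    _check(h2, p)
    return math.sqrt((h2 + p - 2.0 * p * h2) / (1.0 - p * h2))


def relative_efficiency(h2: float, p: float) -> float:
    """Accuracy of the index relative to phenotypic selection (sqrt(h2))."""
    return index_accuracy(h2, p) / math.sqrt(h2)


# Markers assumed to explain 20% of the additive genetic variance.
for h2 in (0.1, 0.25, 0.5):
    b_z, b_m = index_weights(h2, p=0.2)
    print(f"h2={h2}: b_z={b_z:.3f}, b_m={b_m:.3f}, "
          f"relative efficiency={relative_efficiency(h2, 0.2):.3f}")
```

For a fixed p, the relative advantage of the index grows as heritability falls. The sketch ignores the cost of estimating marker effects, which is exactly what becomes harder at low heritability, as the paradox described above points out.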

Crossbred or hybrid performance. In theory, crosses 
between lines that are genetically distant are expected to
show greater hybrid vigour or heterotic effects than those
between more closely related lines, because differences in
allele frequencies between genetically distant lines are
expected to be greater. Genetic distance can be measured
from differences in allele frequencies at anonymous
markers spread throughout the genome. Evaluation of
this concept for many crops shows that marker-based
prediction of hybrid performance can be efficient if
hybrids include crosses between lines that are related by
pedigree or which trace back to common ancestral populations.
By contrast, prediction is not efficient for crosses
between lines that are unrelated or that originated from
different populations, because the associations (through
LD) between marker loci and QTL that are involved in
heterosis are not the same in the different populations.
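One simple way to turn allele-frequency differences into a distance is the root-mean-square difference across markers. The Python sketch below assumes biallelic markers, with one allele frequency recorded per marker and per line. It is only one of several possible distance measures and is not claimed to be the one used in the studies cited above. The line names and frequencies are hypothetical.

```python
import math
from typing import Sequence


def allele_frequency_distance(freqs_a: Sequence[float], freqs_b: Sequence[float]) -> float:
    """Root-mean-square difference in allele frequency across markers.
    Returns 0 for identical lines and 1 for lines fixed for alternative
    alleles at every marker."""
    if len(freqs_a) != len(freqs_b) or len(freqs_a) == 0:
        raise ValueError("both lines need frequencies for the same, non-empty marker set")
    squared = [(a - b) ** 2 for a, b in zip(freqs_a, freqs_b)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical inbred lines scored at four markers (frequency of allele 1).
line_a = [1.0, 0.0, 1.0, 0.0]
line_b = [1.0, 0.0, 0.0, 1.0]
line_c = [1.0, 0.0, 1.0, 1.0]
d_ab = allele_frequency_distance(line_a, line_b)
d_ac = allele_frequency_distance(line_a, line_c)
print(f"distance A-B = {d_ab:.3f}, distance A-C = {d_ac:.3f}")
```

As the paragraph above notes, such a distance predicts hybrid performance only to the extent that the markers share LD with the QTL involved in heterosis, so a larger distance is not by itself a guarantee of greater hybrid vigour.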

The limited ability to predict hybrid vigour in 
untested crosses has motivated the development of 
strategies that use the knowledge of QTL effects to generate
crosses that are predicted to create QTL genotypes
with favourable non-additive effects. An example is the 
use of marker-based statistical methods to predict the 
performance of untested crosses from the performance 
of parental lines in a limited number of test crosses.

State of the art 

In contrast to the past decades, when almost no markers 
were available and breeding was mostly based on selec- 
tion on phenotype, an ideal view of the future could be 
that the location and function of all genes that affect 
quantitative traits are known. Genotype building strategies
could then be applied directly on those genes and
tedious phenotype scoring would no longer be neces- 
sary. This, however, assumes that the effects of those 
genes are known with precision and are consistent; for 
example, in different environments and genetic back- 
grounds. Although this is far from the case, some geno- 
type building strategies are already routinely used (at 
least in plants) to manipulate genes of large effect or 
transgenic constructs; for example, in introgression programmes.
However, as theoretical and experimental
Table 2 | Strategies for the use of molecular data in genetic improvement programmes*

Programme: Genotype building (pyramiding)
Information required to compute molecular score: Genotypes at target loci‡. (Linkage between target loci.) (Linkage phases between target loci.)
Composition of molecular score: Presence-absence of target alleles. (Modified by linkage phase for linked target loci.)
Selection or decision criterion: Molecular score.

Programme: Introgression, foreground selection
Information required to compute molecular score: Genotypes at target loci‡.
Composition of molecular score: Presence-absence of target alleles.
Selection or decision criterion: Molecular score.

Programme: Introgression, background selection
Information required to compute molecular score: Genotypes at marker loci across genome. (Linkage between markers.) (Linkage with target loci.)
Composition of molecular score: Proportion of recipient alleles. (Proportion of recipient genome.) (Greater emphasis on markers linked to target loci.)
Selection or decision criterion: Molecular score. Index of molecular score and recipient trait phenotype.

Programme: Introgression, intercross selection
Information required to compute molecular score: Genotypes at target loci‡.
Composition of molecular score: Number of target alleles.
Selection or decision criterion: Molecular score.

Programme: Recurrent selection
Information required to compute molecular score: Genotypes at QTL‡ or markers. Estimates of QTL or marker effects. (Linkage phases between QTL.)
Composition of molecular score: Sum of effects for genotypes at QTL or markers. (Modified by linkage phase between tightly linked QTL.)§
Selection or decision criterion: Molecular score. Molecular score followed by phenotype. Index of molecular score and phenotype.

Programme: Crossbreeding or hybrid production (choice of breeds or lines to cross)
Information required to compute molecular score: Allele frequencies at marker loci across genome.
Composition of molecular score: Genetic distance between pairs of breeds or lines.
Selection or decision criterion: Molecular score.

Programme: Crossbreeding or hybrid production (choice of breeds or lines to cross)
Information required to compute molecular score: Genotypes at QTL or markers and QTL or marker effects.
Composition of molecular score: Sum of effects for predicted genotypes at QTL or markers.
Selection or decision criterion: Molecular score.

*Items in brackets are optional. ‡This must be derived from linked markers if the functional gene has not been mapped. §REF. 64.



SELECTION INDEX THEORY
Theory of selection that
combines several traits or
sources of information, such
that the accuracy of the index as
a predictor of the selection goal
(for example, the breeding
value) is maximized.



results of QTL detection have accumulated, the initial
enthusiasm for the potential genetic gains allowed by
molecular genetics has been tempered by evidence for
limits to the precision of the estimates of QTL effects.
The present mood is one of 'cautious optimism'.

Today, a database literature search for 'marker-assisted
selection' provides hundreds of hits, but, in most
cases, MAS is mentioned only as a future perspective.
Others have evaluated the potential of MAS using computer
simulation. Overall, there are still few reports of
successful MAS experiments or applications. Most refer
to the use of molecular markers in genotype building
programmes, at various levels of complexity. Successful
reports include marker-assisted background selection
with introgression of genes for which the functional
variant is known, or which have clearly identifiable phenotypic
effects. Examples are the introgression of the Bt
transgene into different maize genetic backgrounds, of
the Apoe-null allele in mice, and of the naked neck gene
in chickens (FIG. 4). Marker-assisted introgression of
such 'known' genes is now widely used in plants, in particular
by private plant-breeding companies. However,
even in this case, more work is needed to optimize the
information provided by markers, and reduce costs.
Other reports on genotype building using known genes
include the 'pyramiding' of several major disease resistance
genes in rice. Although a good knowledge of
the spectrum of gene effects is necessary for the pyramiding
of multiple resistance genes, it is a proven valuable
step towards more durable and stable resistance,
which could hardly be achieved without markers.
Moreover, the use of markers provides a better understanding
of interactions between the introgressed genes.



The experience of introgression of QTL using indi- 
rect markers in foreground selection is quite different. 
In general, introgression has resulted in improvement 
of the targeted traits but, with few exceptions (for 
example, see REF. zs), levels of improvement were below 
the expectations based on estimates of QTL effects 
from the detection phase. The reasons for this 
underperformance include inaccurate estimates of QTL 
location, QTL that were lost or not controlled in the 
programme, negative epistatic interactions between 
QTL, or strong genotype-environment interactions. 
Similar results were obtained for the introgression 
of three QTL for trypanotolerance in mice by 
gene pyramiding, which represents the only report of 
marker-assisted foreground selection of QTL in animals; 
the markers proved useful to control the QTL 
genotype during the backcrossing phase, but the effects 
of the QTL in the new background were not always 
consistent with those observed during the QTL 
detection phase. 

The general conclusion to be drawn from these 
results is that for complex traits that are controlled by 
several QTL of moderate or low effect, or that are subject 
to high environmental variation, genotype-environment 
interactions, epistasis between QTL or epistasis 
between QTL and the genetic background, it is risky to 
carry out selection solely on the basis of marker effects, 
without confirming the estimated effects by phenotypic 
evaluation. This is true in particular if QTL were initially 
detected in a different population or genetic back- 
ground. 

Although no documented reports are available, 
industrial applications of molecular data in livestock are 
RECOMBINANT INBRED LINE 

A population of fully 
homozygous individuals that is 
obtained by repeated selfing 
from an F1 hybrid, and that 
comprises ~50% of each 
parental genome in different 
combinations. 

NEAR ISOGENIC LINE 

Lines that are genetically 
identical, except for one locus or 
chromosome segment. 



[Figure 2 diagram labels: crosses (×); selection of 
homozygotes for G1 + G2; selection of homozygotes for 
G3 + G4; F2, RIL or DH; selection of homozygotes for 
G1 + G2 + G3 + G4.] 



Figure 2 | Gene pyramiding. This example shows how four 
genes (G1-G4), which are present in four different lines 
(L1-L4), can be combined into a single line in a two-step 
procedure. In the first step, two lines are developed, which are 
each homozygous for two target genes (G1, G2 and G3, G4), 
by crossing pairs of lines. This is followed by construction of F2, 
RECOMBINANT INBRED LINE (RIL), or double-haploid (DH) 
progeny and selection of homozygotes. In the second step, 
such individuals are crossed to produce lines that are 
homozygous for all four target genes. Selection of 
homozygotes can be on the basis of linked markers. This 
process can be expanded to more than four genes by 
expanding the pyramid. 
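
The population sizes implied by such a pyramiding scheme can be illustrated with a small calculation. The sketch below is not part of the original figure: it assumes unlinked target genes, Mendelian segregation and fully informative markers, and the function names are hypothetical. It gives the expected frequency of progeny that are homozygous for all target genes among F2 progeny, or among DH or fully inbred RIL progeny, of an F1 that is heterozygous at every target locus, and the number of progeny needed to recover at least one such individual with a chosen probability.

```python
import math


def target_homozygote_freq(n_genes: int, progeny_type: str) -> float:
    """Expected frequency of progeny homozygous for the target allele at
    all n_genes loci, starting from an F1 heterozygous at each locus.

    Assumes unlinked loci and no selection during line development.
    """
    # Per-locus chance of the target homozygote: 1/4 in an F2,
    # 1/2 in doubled haploids or fully inbred RILs.
    per_locus = {"F2": 0.25, "DH": 0.5, "RIL": 0.5}[progeny_type]
    return per_locus ** n_genes


def min_progeny(freq: float, prob_success: float = 0.95) -> int:
    """Smallest number of progeny that yields at least one target
    genotype with probability prob_success."""
    return math.ceil(math.log(1.0 - prob_success) / math.log(1.0 - freq))


if __name__ == "__main__":
    for n_genes in (2, 4):
        for kind in ("F2", "DH"):
            freq = target_homozygote_freq(n_genes, kind)
            print(n_genes, kind, freq, min_progeny(freq))
```

Under these assumptions, each first-step cross corresponds to `n_genes = 2`. Pyramiding all four genes from a single F1 that is heterozygous at every locus (`n_genes = 4`) needs far more F2 progeny, which is one reason a stepwise pyramid with intermediate selection of homozygotes can be attractive.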



limited and have mainly been in the context of recurrent 
selection programmes, which are the principal vehicles 
for genetic improvement in animals. A mixture of causal 
and indirect markers is used. In swine, the indirect 
markers used were primarily identified by using 
candidate-gene approaches or positional cloning, whereas in 
dairy cattle, indirect markers identified using genome 
scans are also used. This species difference is partially 
explained by the different strategies that are used for 
QTL detection. In swine, genome scans are primarily 
based on crosses between divergent lines. These identify 
QTL that differ between breeds but have limited direct 
application for within-breed selection. Direct access to 
closed breeding populations has, however, made 
candidate-gene approaches relatively successful. In dairy 
cattle, QTL detection capitalizes on the large half-sib family 
sizes that result from extensive use of artificial 
insemination. This allows genome scans to detect QTL that 
segregate within rather than between breeds. 



Most applications of MAS in livestock are geared 
towards cautious use that does not jeopardize the 
genetic gains that can be obtained by conventional 
selection, for example in pre-selection (FIG. 3). Other 
uses are for traits that are difficult to improve by 
conventional means because of low heritability (for 
example, the use of an oestrogen receptor gene marker to 
select for litter size in swine), or traits that are difficult 
to record (for example, traits that are related to disease 
resistance or meat quality). 

Challenges and future prospects 

Statistical aspects of MAS. Most applications of genetic 
markers in selection programmes are preceded by an 
analysis aimed at QTL detection, and only QTL that are 
shown to have a significant effect on phenotype are sub- 
sequently used for selection. This raises two important 
statistical issues: the setting of statistical thresholds for 
deciding which QTL to use; and dealing with the inher- 
ent overestimation of QTL effects. 

For QTL detection, very stringent methods are used 
to control the false-positive error rate, as suggested by 
Lander and Kruglyak. Several studies have, however, 
shown that greater gains from MAS can be obtained by 
allowing a higher rate of false positives, to increase the 
power to detect QTL effects and reduce the number of 
false-negative results. So, alternative strategies (for 
example, see ref. m) are needed to more adequately 
balance the cost of false-positive against false-negative 
results for MAS. This balance might differ depending 
on the particular application. Thresholds could be 
lowered even further if proper statistical methods were 
used to account for the degree of uncertainty about 
estimates of QTL effects. For example, Meuwissen et 
al. obtained a molecular score with high predictive 
ability on the basis of high-density marker genotyping 
data by using all estimated marker effects, regardless of 
their statistical significance. 
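
To make the contrast between thresholded and unthresholded scores concrete, the sketch below computes a molecular score as the sum of allele counts multiplied by estimated marker effects, either over all markers or only over markers that pass a significance threshold. The marker names, effect estimates, standard errors and threshold are invented for illustration, and the plain sum is an assumption; it is not the estimation method of Meuwissen et al.

```python
from typing import Dict, NamedTuple


class MarkerEffect(NamedTuple):
    estimate: float   # estimated effect per copy of the reference allele
    std_error: float  # standard error of that estimate


def molecular_score(genotype: Dict[str, int],
                    effects: Dict[str, MarkerEffect],
                    min_abs_t: float = 0.0) -> float:
    """Sum of allele count x estimated effect over markers whose
    |estimate / std_error| is at least min_abs_t (0.0 keeps every marker)."""
    score = 0.0
    for marker, allele_count in genotype.items():
        eff = effects[marker]
        if abs(eff.estimate / eff.std_error) >= min_abs_t:
            score += allele_count * eff.estimate
    return score


# Hypothetical estimates for three markers and one individual's genotype
# (copies of the reference allele at each marker: 0, 1 or 2).
effects = {
    "M1": MarkerEffect(0.50, 0.10),
    "M2": MarkerEffect(0.12, 0.10),
    "M3": MarkerEffect(-0.08, 0.10),
}
genotype = {"M1": 1, "M2": 2, "M3": 2}

score_all = molecular_score(genotype, effects)                 # every marker
score_sig = molecular_score(genotype, effects, min_abs_t=3.0)  # stringent threshold
print(score_all, score_sig)
```

With a stringent threshold only M1 contributes, so any real effect at M2 or M3 is ignored. Using all markers keeps that information but also adds estimation noise, which is why methods that account for the uncertainty of each estimate are needed.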

Overestimation of QTL effects has been shown to 
occur both by theory and by experimentation 
(see also the review by Barton and Keightley on p. 11 
of this issue). Overestimation of QTL effects leads to 
too much emphasis on molecular scores in selection 
relative to phenotypic data, and results in a less than 
optimal response to selection. In part, biases are 
caused by the use of only significant QTL effects, and 
they can be reduced, although not entirely removed, 
by re-estimation of significant QTL effects in an 
independent sample. A less-biased estimate of QTL 
effects can be obtained using NEAR ISOGENIC LINES, but 
the generation of such lines is a long and difficult 
process. Alternative statistical methods for the analysis 
of QTL data that avoid overestimation or reduce 
its impact on selection response are needed (for 
example, see ref. 44). 
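
The bias that arises from using only significant effects can be reproduced in a few lines of simulation. The sketch below is an illustrative assumption rather than an analysis from the studies cited: each 'experiment' estimates the same true QTL effect with normally distributed error, only estimates that exceed a significance threshold are retained, and the retained QTL are then re-estimated in an independent sample. All parameter values and names are hypothetical.

```python
import random
import statistics


def simulate_selection_bias(true_effect=0.2, std_error=0.15, t_threshold=2.0,
                            n_experiments=20000, seed=1):
    """Return (mean estimate among significant experiments,
    mean re-estimate of those QTL in an independent sample)."""
    rng = random.Random(seed)
    significant, re_estimates = [], []
    for _ in range(n_experiments):
        estimate = rng.gauss(true_effect, std_error)
        if estimate / std_error > t_threshold:  # QTL declared significant
            significant.append(estimate)
            # Independent sample: a fresh draw around the same true effect.
            re_estimates.append(rng.gauss(true_effect, std_error))
    return statistics.mean(significant), statistics.mean(re_estimates)


mean_significant, mean_reestimated = simulate_selection_bias()
print(mean_significant, mean_reestimated)
```

In this simplified model the significance filter is the only source of bias, so the independent re-estimates centre on the true effect. It does not capture the other sources of bias that, as noted above, re-estimation does not entirely remove.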

A more general point about the statistical aspects of 
MAS is that the existing models and theory do not 
adequately accommodate the more complex genetics 
that underlies quantitative traits. Furthermore, 
although existing quantitative genetic theory provides 
a satisfactory basis to derive selection strategies that 
maximize response to selection in the short term (one 
or two generations), the theory has been much less 
developed for selection over several generations. This 
was most clearly seen in several simulation studies that 
showed that combined selection on an index of molec- 
ular score and phenotype results in greater genetic gain 




[Figure 3 panel labels: multiple ovulation and embryo transfer; 
pre-selection for marker M and progeny testing.] 




Figure 3 | Marker-assisted pre-selection for progeny testing. Because milk production is a 
sex-limited trait, dairy bulls go through a progeny test, in which they are evaluated on the basis of 
the milk production of 60-100 daughters. After the progeny test, the best bulls are selected for 
widespread use in the population through artificial insemination. Because of the high cost 
involved, only a limited number of bulls can be progeny tested each year. Selection of bulls to be 
tested is based on ancestral information, which means that all members of a full-sib family have 
the same estimated breeding value. Molecular scores will, however, differ between full-sibs if they 
inherited different marker alleles. Through reproductive technology, such as multiple ovulation and 
embryo transfer, several bull calves are produced per female and selection of bulls to progeny test 
can be on the basis of molecular score. The combination of marker-assisted pre-selection 
and progeny testing has a greater chance of producing highly productive animals. 



in the short term; but, in the long term, selection on 
phenotype alone resulted in a greater response to 
selection, because selection is better distributed over all 
loci. A theory to optimize selection on molecular 
score, in combination with phenotype, has been 
developed, but for genetic models and selection 
strategies of limited complexity. Further theoretical work is 
needed to accommodate multilocus Mendelian inheri- 
tance and phenomena such as epistasis, genetic back- 
ground effects and interactions between the environ- 
ment and genetics. 
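
For the short-term case, standard selection index theory gives explicit weights for combining phenotype and molecular score. The sketch below is a single-generation illustration under assumptions that are not taken from the studies cited: one trait, an individual's own phenotype only, and a molecular score that captures a known proportion `p` of the additive genetic variance without error. The function names are hypothetical, and the sketch says nothing about the long-term behaviour discussed above.

```python
import math


def index_weights(h2: float, p: float):
    """Weights (b_phenotype, b_score) of the index
        I = b_phenotype * phenotype + b_score * molecular_score.

    h2: heritability of the trait (0 < h2 <= 1).
    p:  proportion of additive genetic variance explained by the
        molecular score (0 < p <= 1).
    Solves the index equations P b = g, scaled by the additive variance:
        [[1/h2, p], [p, p]] b = [1, p]
    """
    a11, a12, a21, a22 = 1.0 / h2, p, p, p
    g1, g2 = 1.0, p
    det = a11 * a22 - a12 * a21
    b_phenotype = (g1 * a22 - a12 * g2) / det  # Cramer's rule
    b_score = (a11 * g2 - a21 * g1) / det
    return b_phenotype, b_score


def index_accuracy(h2: float, p: float) -> float:
    """Correlation between the optimal index and the true breeding value."""
    b_phenotype, b_score = index_weights(h2, p)
    return math.sqrt(b_phenotype * 1.0 + b_score * p)


if __name__ == "__main__":
    for h2 in (0.1, 0.25, 0.5):
        print(h2, index_weights(h2, 0.2), index_accuracy(h2, 0.2), math.sqrt(h2))
```

The weight on the molecular score rises as heritability falls, which agrees with the observation elsewhere in this review that markers are most useful for traits of low heritability. If marker effects are overestimated, `p` is overstated and the index places too much weight on the score.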

Redesign of breeding programmes. Most applications 
of molecular genetics to breeding programmes have 
attempted to incorporate molecular data into the 
existing programmes. The effective use of molecular 
data might, however, require a complete redesign of 
breeding programmes. For example, in plants, the 
optimal design for MAS is to allocate test resources to 
a single, large population, such that the probability of 
detecting QTL is high, whereas for phenotypic selec- 
tion, the optimum is to have smaller populations in 
several locations to control for environmental 
variation. In addition, population structures and 
statistical methods that allow the combination and use of 
QTL information across lines are needed. Other 
changes that are required for plant breeding 
programmes are reviewed by Ribaut and Hoisington. 
Similarly, in animals, strategies are required that 
integrate the collection and analysis of phenotypic data 
for QTL detection with the use of this information for 
MAS (for example, ref. 37). 

Furthermore, breeding strategies must be devel- 
oped that take better advantage of the unique features 
of molecular data. For example, to capitalize on the 
ability to select on molecular score at an early age, 
several rapid rounds of selection exclusively on molecular 
score could be conducted. The speed of selection is 
then mainly limited by the reproductive cycle. Such 
programmes have been proposed for plants by 
Hospital et al., by incorporating one or two 
generations of off-season selection on molecular score alone, 
and have been shown (by simulation) to increase 
genetic gain greatly. In animals, such strategies are 
effective only if combined with technologies that break 
the normal reproductive cycle. For example, in several 
livestock species, the technology exists to recover 
oocytes from the female before puberty, as early as 
from the unborn fetus. When combined with in vitro 
fertilization and embryo transfer, this reduces 
generation intervals to several months, compared with at least 
3 years with regular reproduction in cattle. Haley and 
Visscher suggested that the time required for one 
generation could be further reduced if meiosis could 
be conducted in vitro. Such technology, combined with 
nuclear transfer, would allow a breeding programme to 
be conducted in the laboratory, without creating ani- 
mals. Although some of this work is at an early stage, it 
is clear that the benefits of MAS will be much greater 
when molecular technology is integrated with repro- 
ductive technologies. 
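
The leverage that shorter generation intervals provide follows from the standard expression for the rate of genetic gain per year: selection intensity multiplied by selection accuracy and by the additive genetic standard deviation, divided by the generation interval. The sketch below applies it to two scenarios. All numerical values are hypothetical and chosen only to illustrate the arithmetic; they are not estimates from the studies cited.

```python
def annual_gain(intensity: float, accuracy: float, sd_additive: float,
                generation_interval_years: float) -> float:
    """Expected genetic gain per year: i * r * sigma_A / L."""
    return intensity * accuracy * sd_additive / generation_interval_years


# Hypothetical scenario 1: phenotype-based selection, 3-year interval.
conventional = annual_gain(intensity=1.0, accuracy=0.6, sd_additive=1.0,
                           generation_interval_years=3.0)

# Hypothetical scenario 2: selection on molecular score alone, with lower
# accuracy but a 6-month interval made possible by reproductive technology.
rapid = annual_gain(intensity=1.0, accuracy=0.4, sd_additive=1.0,
                    generation_interval_years=0.5)

print(conventional, rapid, rapid / conventional)
```

In this example the lower accuracy of selection on molecular score alone is more than compensated by the shorter interval. The comparison ignores the erosion of marker-QTL associations over generations and the cost of the reproductive technologies.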
Figure 4 | Introgression of the avian naked neck gene. The autosomal naked neck gene, 
which affects feather distribution in chickens and makes them more tolerant to heat, was 
introgressed from rural low-body-weight donor chickens (two small birds) into a commercial 
meat-type Cornish chicken recipient line (two large, white birds). Genome-wide markers were 
used to enhance recovery of the recipient line genome, which conveys rapid growth and high 
body weight. Picture is courtesy of A. Cahaner, The Hebrew University, Rehovot, Israel. 
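
As a baseline for what genome-wide markers add in such a programme, the expected proportion of recipient genome without any marker-assisted background selection follows directly from repeated halving of the donor contribution. The sketch below is a textbook expectation for genomic regions that are unlinked to the target gene; the function name is hypothetical, and the figures do not describe the experiment shown in the figure.

```python
def expected_recipient_fraction(n_backcrosses: int) -> float:
    """Expected recipient-genome fraction after n_backcrosses generations of
    backcrossing to the recipient line, counting the F1 (donor x recipient)
    as generation 0, with no selection on background markers."""
    return 1.0 - 0.5 ** (n_backcrosses + 1)


if __name__ == "__main__":
    for generation in range(0, 7):
        print(generation, expected_recipient_fraction(generation))
```

Individuals within a backcross generation vary around this expectation, and background selection on markers exploits that variation to reach a given level of recipient genome in fewer generations.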



The need to fine map quantitative trait loci. The ulti- 
mate aim of molecular genetic studies of quantitative 
genetic variation is to find the genes that influence the 
trait. However, the use of MAS does not require the gene 
to be known, but can be effective with linked markers. 
So, the crucial issue is how closely a QTL must be 
mapped for it to be useful for MAS. 

Several simulation studies have shown that for MAS 
based on within-family LD, informative markers that 
flank a QTL within 5 cM seem adequate. Given that 
markers are not fully informative in practice, this can be 
achieved by using haplotypes of several markers within a 
10-cM region around the QTL. For example, Spelman 
and Bovenhuis found that a flanking marker interval 
of 5 cM around the QTL achieved ~85-90% of the extra 
response over selection without markers, relative to a 
flanking marker interval of 2 cM. 
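
Why flanking markers a few centiMorgans apart track a QTL well within families can be seen from recombination probabilities. The sketch below uses Haldane's mapping function (no crossover interference) and assumes fully informative markers with the QTL at the midpoint of the marker interval; these are simplifying assumptions made here, and the function names are hypothetical.

```python
import math


def haldane_recombination(distance_cm: float) -> float:
    """Recombination fraction for a map distance in cM (Haldane's function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def mistracking_single(distance_cm: float) -> float:
    """Probability that one marker allele has separated from the QTL allele
    after a single meiosis."""
    return haldane_recombination(distance_cm)


def mistracking_flanked(interval_cm: float) -> float:
    """Probability that the QTL allele has switched even though both flanking
    markers were inherited as a non-recombinant haplotype (QTL at midpoint)."""
    r = haldane_recombination(interval_cm / 2.0)
    double_rec = r * r               # one crossover on each side of the QTL
    no_rec = (1.0 - r) * (1.0 - r)   # no crossover on either side
    return double_rec / (double_rec + no_rec)


if __name__ == "__main__":
    for interval in (2.0, 5.0, 10.0, 20.0):
        print(interval, mistracking_single(interval / 2.0),
              mistracking_flanked(interval))
```

Under these assumptions the error rate with flanking markers remains very small for intervals of 5 to 10 cM, which is in line with the limited benefit of finer mapping for MAS based on within-family LD.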

Although further fine mapping of QTL might pro- 
vide limited benefits for MAS based on within-family 
LD, the occurrence of population-wide LD will increase 
substantially if the markers are more tightly linked to 
the QTL. Selection on markers that are in 
population-wide LD with QTL is much preferred because QTL 
effects and linkage phase can be estimated from 
population-wide data instead of the limited data that would be 
available within a family. For individual QTL, markers 
or marker haplotypes within 1 or 2 cM of the causative 
locus might be required for substantial population-wide 
LD to be present, depending on population size and 
selection history. 
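
The dependence of population-wide LD on marker distance and population size can be illustrated with a widely used approximation, usually attributed to Sved, for the expected squared correlation between alleles at two loci at drift-recombination equilibrium: 1 / (1 + 4·Ne·c). This formula is introduced here for illustration and is not taken from the studies cited. It ignores mutation, selection and sampling, and the sketch treats the recombination fraction c as equal to the map distance in Morgans, which is reasonable only for short distances.

```python
def expected_r2(effective_size: float, distance_cm: float) -> float:
    """Approximate expected LD (r^2) between two loci: 1 / (1 + 4 * Ne * c),
    where c is the recombination fraction (taken as distance in Morgans)."""
    c = distance_cm / 100.0
    return 1.0 / (1.0 + 4.0 * effective_size * c)


if __name__ == "__main__":
    for ne in (50, 100, 1000):
        for distance in (1.0, 2.0, 5.0):
            print(ne, distance, round(expected_r2(ne, distance), 3))
```

For an effective population size of 100, the expected r² falls from about 0.2 at 1 cM to about 0.05 at 5 cM under this approximation, which is consistent with the need for markers within 1 or 2 cM of the causative locus.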

LD can be exploited at a genome-wide level when 
marker data are available from a high-density marker 
map; for example, with a marker every centiMorgan. 
The potential of using such data was illustrated by 
Meuwissen et al., who simulated genome-wide data 



HAPLOTYPE 

The combination of alleles at 
several loci on a single 
chromosome. For example, for a 
marker with alleles M and m 
that is linked to a quantitative 
trait locus with alleles Q and q, 
possible haplotypes are MQ, 
Mq, mQ and mq. 

EFFECTIVE POPULATION SIZE 

The size of a random mating 
population that would lead to 
the same rate of inbreeding as 
the breeding population that is 
under selection. Quantifies the 
amount of random change in 
allele and haplotype frequencies 
that can occur in the population, 
which can give rise to linkage 
disequilibrium. 

EXOTIC GENETIC RESOURCE 
Wild, unadapted or non- 
commercial population that can 
be used as a source of new 
genetic material for improved 
populations. 



for a breeding population based on the historical accu- 
mulation of mutations (which gives rise to QTL) at 
locations throughout the genome in the context of a 
high-density marker map. They then computed 
molecular scores based on statistical associations of 
phenotype with marker haplotypes to capture 
population-wide LD. For populations that are representative of 
livestock with an effective population size of 100, they 
showed that sufficient LD was available and that the 
molecular score had an accuracy of 85% as a predictor 
of the total genetic value of an individual, when marker 
spacing was 1 cM. Accuracy dropped to 81 and 74%, 
respectively, for marker spacings of 2 and 4 cM. 

Fine mapping of QTL will also increase the efficiency 
of foreground selection in introgression programmes 
because the genomic region that has to be controlled is 
smaller. This will reduce the number of individuals that 
are required and the genotyping cost. In addition, 
introgression of a smaller genomic region helps to eliminate 
unwanted genes that are located around the target QTL. 
This is particularly important when the donor is an 
EXOTIC GENETIC RESOURCE. Similar considerations also hold 
true for recurrent MAS. 

So, the extensive resources that are required to fine 
map QTL, let alone clone the functional gene, will bene- 
fit genetic improvement programmes only to a degree. 
More detailed knowledge of the functional genes would, 
however, allow a better understanding of the physiology 
of the quantitative trait. This might allow better 
prediction of the effects of the QTL in different genetic 
backgrounds and environmental conditions, and on different 
characteristics of performance. In addition, specific 
management strategies could be developed for specific 
genotypes to enhance their performance. 

The economics of marker-assisted selection 

Economics is the key determinant for the application of 
molecular genetics in genetic improvement pro- 
grammes. The use of markers in selection incurs the 
costs that are inherent to molecular techniques. Apart 
from the cost of QTL detection, which can be 
substantial, costs for MAS include the costs of DNA collection, 
genotyping and analysis. The economic assessment of 
MAS is straightforward in some cases, but complex in 
others (TABLE 1), and has been addressed in few studies 
(for example, REFS 37,51,58,59). These studies have relied 
primarily on genetic and economic modelling because 
the results are extremely difficult to verify using repli- 
cated experiments. 

Cases in which the economic merit of MAS is clear 
include situations in which molecular costs are more 
than offset by the savings in phenotypic evaluation. 
Examples are the use of markers in genotype building 
programmes, and selection on markers that are in pop- 
ulation-wide LD for traits that are costly to evaluate 
(for example, disease resistance and meat-quality traits 
in animals). In other cases, the ability to select early 
offsets the extra costs that are associated with MAS. The 
benefits of being able to release new genetic material 
more quickly can be substantial, particularly in 
competitive markets. 
The economic merit of MAS becomes questionable 
and more difficult to evaluate in cases in which MAS is 
expected to provide greater genetic gain at increased 
costs. This is particularly the case for selection schemes 
that rely on a combination of phenotype and molecular 
score, because molecular costs are in addition to, not in 
place of, phenotypic costs. In such cases, MAS might 
not be economically more advantageous than 
quantitative genetic selection, although the economic merit of 
MAS could be restored by reducing the frequency of 
re-evaluation of marker effects. Another consideration is 
that the resources allocated to MAS could also be 
allocated to enhance phenotypic selection programmes. 
For example, improvement by conventional selection 
could also be enhanced by increasing the number of 
individuals that are tested for phenotypic evaluation. 
Further work on the economic evaluation and 
optimization of strategies for the use of molecular genetics 
in breeding programmes is required. It is likely that the 
economically optimal use of MAS necessitates a 
complete re-think of the design of breeding schemes, as 
described in the previous section. 

Conclusions 

Genetic improvement programmes for livestock and 
crop species can be enhanced by the use of molecular 
genetic information in introgression, genotype building 
and recurrent selection programmes. The prospects for 
MAS are greatest for traits that are difficult to improve 
through conventional means, because of low heritability 
or the difficulty and expense of recording phenotype. 
Recurrent selection using linked markers can be effective 
and does not require identification of the functional 
mutations, although some level of fine mapping is 
required, in particular to capitalize on population-wide 
LD. The identification and use of linked markers is based 
on empirical relationships with phenotype, and is, 
therefore, also limited to some degree by the heritability of the 
trait and the availability of phenotypic data. Phenotypic 
data requirements are lower with the use of 
population-wide LD than with the use of within-family LD. 

Unless genetic markers capture most of the genetic 
variation for the trait, which is far from the case at pre- 
sent, selection must be based on a combination of 
marker and conventional phenotypic data. Although 
several useful genes (primarily linked genetic markers) 
have been identified in livestock and crop species, their 
application has been limited and their success inconsis- 
tent, because the genes were not identified in breeding 
populations, or because they interact with other genes 
or the environment. The most effective use of markers 
has been in introgression programmes in plants. Further 
use of MAS might require a substantial redesign of 
breeding programmes, in combination with other tech- 
nologies, such as those associated with reproduction. 

Further advances in molecular technology and 
genome programmes will soon create a wealth of 
information that can be exploited for the genetic 
improvement of plants and animals. High-throughput 
genotyping, for example, will allow direct selection on marker 
information based on population-wide LD. Methods to 
effectively analyse and use this information in selection 
are still to be developed. The eventual application of 
these technologies in practical breeding programmes will 
be on the basis of economic grounds, which, along with 
cost-effective technology, will require further evidence of 
predictable and sustainable genetic advances using MAS. 
Until complex traits can be fully dissected, the 
application of MAS will be limited to genes of 
moderate-to-large effect and to applications that do not endanger the 
response to conventional selection. Until then, 
observable phenotype will remain an important component of 
genetic improvement programmes, because it takes 
account of the collective effect of all genes. 